Búsqueda | Portal Regional de la BVS

1.

MaskCAE: Masked Convolutional AutoEncoder via Sensor Data Reconstruction for Self-Supervised Human Activity Recognition.

Cheng, Dongzhou; Zhang, Lei; Qin, Lutong; Wang, Shuoyuan; Wu, Hao; Song, Aiguo.

IEEE J Biomed Health Inform ; 28(5): 2687-2698, 2024 May.

Artículo en Inglés | MEDLINE | ID: mdl-38442051

RESUMEN

Self-supervised Human Activity Recognition (HAR) has been gradually gaining a lot of attention in ubiquitous computing community. Its current focus primarily lies in how to overcome the challenge of manually labeling complicated and intricate sensor data from wearable devices, which is often hard to interpret. However, current self-supervised algorithms encounter three main challenges: performance variability caused by data augmentations in contrastive learning paradigm, limitations imposed by traditional self-supervised models, and the computational load deployed on wearable devices by current mainstream transformer encoders. To comprehensively tackle these challenges, this paper proposes a powerful self-supervised approach for HAR from a novel perspective of denoising autoencoder, the first of its kind to explore how to reconstruct masked sensor data built on a commonly employed, well-designed, and computationally efficient fully convolutional network. Extensive experiments demonstrate that our proposed Masked Convolutional AutoEncoder (MaskCAE) outperforms current state-of-the-art algorithms in self-supervised, fully supervised, and semi-supervised situations without relying on any data augmentations, which fills the gap of masked sensor data modeling in HAR area. Visualization analyses show that our MaskCAE could effectively capture temporal semantics in time series sensor data, indicating its great potential in modeling abstracted sensor data. An actual implementation is evaluated on an embedded platform.

Asunto(s)

Algoritmos , Actividades Humanas , Humanos , Actividades Humanas/clasificación , Procesamiento de Señales Asistido por Computador , Dispositivos Electrónicos Vestibles , Aprendizaje Automático Supervisado , Redes Neurales de la Computación

2.

Cross-Attention Enhanced Pyramid Multi-Scale Networks for Sensor-Based Human Activity Recognition.

Pang, Hongsen; Zheng, Li; Fang, Hongbin.

IEEE J Biomed Health Inform ; 28(5): 2733-2744, 2024 May.

Artículo en Inglés | MEDLINE | ID: mdl-38483804

RESUMEN

Human Activity Recognition (HAR) has recently attracted widespread attention, with the effective application of this technology helping people in areas such as healthcare, smart homes, and gait analysis. Deep learning methods have shown remarkable performance in HAR. A pivotal challenge is the trade-off between recognition accuracy and computational efficiency, especially in resource-constrained mobile devices. This challenge necessitates the development of models that enhance feature representation capabilities without imposing additional computational burdens. Addressing this, we introduce a novel HAR model leveraging deep learning, ingeniously designed to navigate the accuracy-efficiency trade-off. The model comprises two innovative modules: 1) Pyramid Multi-scale Convolutional Network (PMCN), which is designed with a symmetric structure and is capable of obtaining a rich receptive field at a finer level through its multiscale representation capability; 2) Cross-Attention Mechanism, which establishes interrelationships among sensor dimensions, temporal dimensions, and channel dimensions, and effectively enhances useful information while suppressing irrelevant data. The proposed model is rigorously evaluated across four diverse datasets: UCI, WISDM, PAMAP2, and OPPORTUNITY. Additional ablation and comparative studies are conducted to comprehensively assess the performance of the model. Experimental results demonstrate that the proposed model achieves superior activity recognition accuracy while maintaining low computational overhead.

Asunto(s)

Aprendizaje Profundo , Actividades Humanas , Humanos , Actividades Humanas/clasificación , Procesamiento de Señales Asistido por Computador , Redes Neurales de la Computación , Algoritmos , Bases de Datos Factuales , Monitoreo Ambulatorio/métodos , Monitoreo Ambulatorio/instrumentación

3.

Human Interaction Understanding With Joint Graph Decomposition and Node Labeling.

Wang, Zhenhua; Ge, Jinchao; Guo, Dongyan; Zhang, Jianhua; Lei, Yanjing; Chen, Shengyong.

IEEE Trans Image Process ; 30: 6240-6254, 2021.

Artículo en Inglés | MEDLINE | ID: mdl-34224352

RESUMEN

The task of human interaction understanding involves both recognizing the action of each individual in the scene and decoding the interaction relationship among people, which is useful to a series of vision applications such as camera surveillance, video-based sports analysis and event retrieval. This paper divides the task into two problems including grouping people into clusters and assigning labels to each of them, and presents an approach to solving these problems in a joint manner. Our method does not assume the number of groups is known beforehand as this will substantially restrict its application. With the observation that the two challenges are highly correlated, the key idea is to model the pairwise interacting relations among people via a complete graph and its associated energy function such that the labeling and grouping problems are translated into the minimization of the energy function. We implement this joint framework by fusing both deep features and rich contextual cues, and learn the fusion parameters from data. An alternating search algorithm is developed in order to efficiently solve the associated inference problem. By combining the grouping and labeling results obtained with our method, we are able to achieve the semantic-level understanding of human interactions. Extensive experiments are performed to qualitatively and quantitatively evaluate the effectiveness of our approach, which outperforms state-of-the-art methods on several important benchmarks. An ablation study is also performed to verify the effectiveness of different modules within our approach.

Asunto(s)

Actividades Humanas/clasificación , Procesamiento de Imagen Asistido por Computador/métodos , Aprendizaje Automático , Algoritmos , Humanos , Grabación en Video

4.

IPGN: Interactiveness Proposal Graph Network for Human-Object Interaction Detection.

Wang, Haoran; Jiao, Licheng; Liu, Fang; Li, Lingling; Liu, Xu; Ji, Deyi; Gan, Weihao.

IEEE Trans Image Process ; 30: 6583-6593, 2021.

Artículo en Inglés | MEDLINE | ID: mdl-34270424

RESUMEN

Human-Object Interaction (HOI) Detection is an important task to understand how humans interact with objects. Most of the existing works treat this task as an exhaustive triplet ã human, verb, object ã classification problem. In this paper, we decompose it and propose a novel two-stage graph model to learn the knowledge of interactiveness and interaction in one network, namely, Interactiveness Proposal Graph Network (IPGN). In the first stage, we design a fully connected graph for learning the interactiveness, which distinguishes whether a pair of human and object is interactive or not. Concretely, it generates the interactiveness features to encode high-level semantic interactiveness knowledge for each pair. The class-agnostic interactiveness is a more general and simpler objective, which can be used to provide reasonable proposals for the graph construction in the second stage. In the second stage, a sparsely connected graph is constructed with all interactive pairs selected by the first stage. Specifically, we use the interactiveness knowledge to guide the message passing. By contrast with the feature similarity, it explicitly represents the connections between the nodes. Benefiting from the valid graph reasoning, the node features are well encoded for interaction learning. Experiments show that the proposed method achieves state-of-the-art performance on both V-COCO and HICO-DET datasets.

Asunto(s)

Procesamiento de Imagen Asistido por Computador/métodos , Redes Neurales de la Computación , Algoritmos , Animales , Bases de Datos Factuales , Actividades Humanas/clasificación , Humanos , Semántica

5.

Attend and Guide (AG-Net): A Keypoints-Driven Attention-Based Deep Network for Image Recognition.

Bera, Asish; Wharton, Zachary; Liu, Yonghuai; Bessis, Nik; Behera, Ardhendu.

IEEE Trans Image Process ; 30: 3691-3704, 2021.

Artículo en Inglés | MEDLINE | ID: mdl-33705316

RESUMEN

This article presents a novel keypoints-based attention mechanism for visual recognition in still images. Deep Convolutional Neural Networks (CNNs) for recognizing images with distinctive classes have shown great success, but their performance in discriminating fine-grained changes is not at the same level. We address this by proposing an end-to-end CNN model, which learns meaningful features linking fine-grained changes using our novel attention mechanism. It captures the spatial structures in images by identifying semantic regions (SRs) and their spatial distributions, and is proved to be the key to modeling subtle changes in images. We automatically identify these SRs by grouping the detected keypoints in a given image. The "usefulness" of these SRs for image recognition is measured using our innovative attentional mechanism focusing on parts of the image that are most relevant to a given task. This framework applies to traditional and fine-grained image recognition tasks and does not require manually annotated regions (e.g. bounding-box of body parts, objects, etc.) for learning and prediction. Moreover, the proposed keypoints-driven attention mechanism can be easily integrated into the existing CNN models. The framework is evaluated on six diverse benchmark datasets. The model outperforms the state-of-the-art approaches by a considerable margin using Distracted Driver V1 (Acc: 3.39%), Distracted Driver V2 (Acc: 6.58%), Stanford-40 Actions (mAP: 2.15%), People Playing Musical Instruments (mAP: 16.05%), Food-101 (Acc: 6.30%) and Caltech-256 (Acc: 2.59%) datasets.

Asunto(s)

Aprendizaje Profundo , Actividades Humanas/clasificación , Procesamiento de Imagen Asistido por Computador/métodos , Femenino , Humanos , Masculino , Semántica

6.

Multitask Non-Autoregressive Model for Human Motion Prediction.

Li, Bin; Tian, Jian; Zhang, Zhongfei; Feng, Hailin; Li, Xi.

IEEE Trans Image Process ; 30: 2562-2574, 2021.

Artículo en Inglés | MEDLINE | ID: mdl-33232232

RESUMEN

Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem. Therefore, extensive efforts have been devoted to exploring different RNN-based encoder-decoder architectures. However, by generating target poses conditioned on the previously generated ones, these models are prone to bringing issues such as error accumulation problem. In this paper, we argue that such issue is mainly caused by adopting autoregressive manner. Hence, a novel Non-AuToregressive model (NAT) is proposed with a complete non-autoregressive decoding scheme, as well as a context encoder and a positional encoding module. More specifically, the context encoder embeds the given poses from temporal and spatial perspectives. The frame decoder is responsible for predicting each future pose independently. The positional encoding module injects positional signal into the model to indicate the temporal order. Besides, a multitask training paradigm is presented for both low-level human skeleton prediction and high-level human action recognition, resulting in the considerable improvement for the prediction task. Our approach is evaluated on Human3.6M and CMU-Mocap benchmarks and outperforms state-of-the-art autoregressive methods.

Asunto(s)

Procesamiento de Imagen Asistido por Computador/métodos , Aprendizaje Automático , Movimiento/fisiología , Actividades Humanas/clasificación , Humanos , Intención , Modelos Estadísticos , Grabación en Video

7.

Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks.

Golestani, Negar; Moghaddam, Mahta.

Nat Commun ; 11(1): 1551, 2020 03 25.

Artículo en Inglés | MEDLINE | ID: mdl-32214095

RESUMEN

Recognizing human physical activities using wireless sensor networks has attracted significant research interest due to its broad range of applications, such as healthcare, rehabilitation, athletics, and senior monitoring. There are critical challenges inherent in designing a sensor-based activity recognition system operating in and around a lossy medium such as the human body to gain a trade-off among power consumption, cost, computational complexity, and accuracy. We introduce an innovative wireless system based on magnetic induction for human activity recognition to tackle these challenges and constraints. The magnetic induction system is integrated with machine learning techniques to detect a wide range of human motions. This approach is successfully evaluated using synthesized datasets, laboratory measurements, and deep recurrent neural networks.

Asunto(s)

Aprendizaje Profundo , Actividades Humanas/clasificación , Fenómenos Magnéticos , Monitoreo Fisiológico/métodos , Procesamiento de Señales Asistido por Computador , Humanos , Movimiento (Física) , Dispositivos Electrónicos Vestibles , Tecnología Inalámbrica

8.

Radial Basis Function Neural Network with Localized Stochastic-Sensitive Autoencoder for Home-Based Activity Recognition.

Ng, Wing W Y; Xu, Shichao; Wang, Ting; Zhang, Shuai; Nugent, Chris.

Sensors (Basel) ; 20(5)2020 Mar 08.

Artículo en Inglés | MEDLINE | ID: mdl-32182668

RESUMEN

Over the past few years, the Internet of Things (IoT) has been greatly developed with one instance being smart home devices gradually entering into people's lives. To maximize the impact of such deployments, home-based activity recognition is required to initially recognize behaviors within smart home environments and to use this information to provide better health and social care services. Activity recognition has the ability to recognize people's activities from the information about their interaction with the environment collected by sensors embedded within the home. In this paper, binary data collected by anonymous binary sensors such as pressure sensors, contact sensors, passive infrared sensors etc. are used to recognize activities. A radial basis function neural network (RBFNN) with localized stochastic-sensitive autoencoder (LiSSA) method is proposed for the purposes of home-based activity recognition. An autoencoder (AE) is introduced to extract useful features from the binary sensor data by converting binary inputs into continuous inputs to extract increased levels of hidden information. The generalization capability of the proposed method is enhanced by minimizing both the training error and the stochastic sensitivity measure in an attempt to improve the ability of the classifier to tolerate uncertainties in the sensor data. Four binary home-based activity recognition datasets including OrdonezA, OrdonezB, Ulster, and activities of daily living data from van Kasteren (vanKasterenADL) are used to evaluate the effectiveness of the proposed method. Compared with well-known benchmarking approaches including support vector machine (SVM), multilayer perceptron neural network (MLPNN), random forest and an RBFNN-based method, the proposed method yielded the best performance with 98.35%, 86.26%, 96.31%, 92.31% accuracy on four datasets, respectively.

Asunto(s)

Actividades Humanas/clasificación , Monitoreo Ambulatorio/métodos , Red Nerviosa , Adulto , Servicios de Atención de Salud a Domicilio , Humanos , Internet de las Cosas , Masculino , Procesos Estocásticos , Máquina de Vectores de Soporte

9.

Implicit Irregularity Detection Using Unsupervised Learning on Daily Behaviors.

Shang, Cuijuan; Chang, Chih-Yung; Chen, Guilin; Zhao, Shenghui; Lin, Jiazao.

IEEE J Biomed Health Inform ; 24(1): 131-143, 2020 01.

Artículo en Inglés | MEDLINE | ID: mdl-30716055

RESUMEN

The irregularity detection of daily behaviors for the elderly is an important issue in homecare. Plenty of mechanisms have been developed to detect the health condition of the elderly based on the explicit irregularity of several biomedical parameters or some specific behaviors. However, few research works focus on detecting the implicit irregularity involving the combination of diverse behaviors, which can assess the cognitive and physical wellbeing of elders but cannot be directly identified based on sensor data. This paper proposes an Implicit IRregularity Detection (IIRD) mechanism that aims to detect the implicit irregularity by developing the unsupervised learning algorithm based on daily behaviors. The proposed IIRD mechanism identifies the distance and similarity between daily behaviors, which are important features to distinguish the regular and irregular daily behaviors and detect the implicit irregularity of elderly health condition. Performance results show that the proposed IIRD outperforms the existing unsupervised machine-learning mechanisms in terms of the detection accuracy and irregularity recall.

Asunto(s)

Servicios de Atención de Salud a Domicilio , Actividades Humanas/clasificación , Aprendizaje Automático no Supervisado , Anciano , Algoritmos , Bases de Datos Factuales , Humanos , Monitoreo Fisiológico

10.

Learning Compact Features for Human Activity Recognition Via Probabilistic First-Take-All.

Ye, Jun; Qi, Guo-Jun; Zhuang, Naifan; Hu, Hao; Hua, Kien A.

IEEE Trans Pattern Anal Mach Intell ; 42(1): 126-139, 2020 01.

Artículo en Inglés | MEDLINE | ID: mdl-30296212

RESUMEN

With the popularity of mobile sensor technology, smart wearable devices open a unprecedented opportunity to solve the challenging human activity recognition (HAR) problem by learning expressive representations from the multi-dimensional daily sensor signals. This inspires us to develop a new algorithm applicable to both camera-based and wearable sensor-based HAR systems. Although competitive classification accuracy has been reported, existing methods often face the challenge of distinguishing visually similar activities composed of activity patterns in different temporal orders. In this paper, we propose a novel probabilistic algorithm to compactly encode temporal orders of activity patterns for HAR. Specifically, the algorithm learns an optimal set of latent patterns such that their temporal structures really matter in recognizing different human activities. Then, a novel probabilistic First-Take-All (pFTA) approach is introduced to generate compact features from the orders of these latent patterns to encode the entire sequence, and the temporal structural similarity between different sequences can be efficiently measured by the Hamming distance between compact features. Experiments on three public HAR datasets show the proposed pFTA approach can achieve competitive performance in terms of accuracy as well as efficiency.

Asunto(s)

Actividades Humanas/clasificación , Reconocimiento de Normas Patrones Automatizadas/métodos , Algoritmos , Bases de Datos Factuales , Humanos , Procesamiento de Imagen Asistido por Computador , Modelos Estadísticos , Grabación en Video , Dispositivos Electrónicos Vestibles

11.

Context-Aware Delivery of Ecological Momentary Assessment.

Aminikhanghahi, Samaneh; Schmitter-Edgecombe, Maureen; Cook, Diane J.

IEEE J Biomed Health Inform ; 24(4): 1206-1214, 2020 04.

Artículo en Inglés | MEDLINE | ID: mdl-31443058

RESUMEN

Ecological Momentary Assessment (EMA) is an in-the-moment data collection method which avoids retrospective biases and maximizes ecological validity. A challenge in designing EMA systems is finding a time to ask EMA questions that increases participant engagement and improves the quality of data collection. In this work, we introduce SEP-EMA, a machine learning-based method for providing transition-based context-aware EMA prompt timings. We compare our proposed technique with traditional time-based prompting for 19 individuals living in smart homes. Results reveal that SEP-EMA increased participant response rate by 7.19% compared to time-based prompting. Our findings suggest that prompting during activity transitions makes the EMA process more usable and effective by increasing EMA response rates and mitigating loss of data due to low response rates.

Asunto(s)

Algoritmos , Evaluación Ecológica Momentánea , Actividades Humanas/clasificación , Anciano , Femenino , Humanos , Masculino , Persona de Mediana Edad , Reconocimiento de Normas Patrones Automatizadas , Estudios Retrospectivos , Factores de Tiempo

12.

NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.

Liu, Jun; Shahroudy, Amir; Perez, Mauricio; Wang, Gang; Duan, Ling-Yu; Kot, Alex C.

IEEE Trans Pattern Anal Mach Intell ; 42(10): 2684-2701, 2020 10.

Artículo en Inglés | MEDLINE | ID: mdl-31095476

RESUMEN

Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.

Asunto(s)

Aprendizaje Profundo , Actividades Humanas/clasificación , Procesamiento de Imagen Asistido por Computador/métodos , Reconocimiento de Normas Patrones Automatizadas/métodos , Algoritmos , Benchmarking , Humanos , Semántica , Grabación en Video

13.

Evaluating and Enhancing the Generalization Performance of Machine Learning Models for Physical Activity Intensity Prediction From Raw Acceleration Data.

Farrahi, Vahid; Niemela, Maisa; Tjurin, Petra; Kangas, Maarit; Korpelainen, Raija; Jamsa, Timo.

IEEE J Biomed Health Inform ; 24(1): 27-38, 2020 01.

Artículo en Inglés | MEDLINE | ID: mdl-31107668

RESUMEN

PURPOSE: To evaluate and enhance the generalization performance of machine learning physical activity intensity prediction models developed with raw acceleration data on populations monitored by different activity monitors. METHOD: Five datasets from four studies, each containing only hip- or wrist-based raw acceleration data (two hip- and three wrist-based) were extracted. The five datasets were then used to develop and validate artificial neural networks (ANN) in three setups to classify activity intensity categories (sedentary behavior, light, and moderate-to-vigorous). To examine generalizability, the ANN models were developed using within dataset (leave-one-subject-out) cross validation, and then cross tested to other datasets with different accelerometers. To enhance the models' generalizability, a combination of four of the five datasets was used for training and the fifth dataset for validation. Finally, all the five datasets were merged to develop a single model that is generalizable across the datasets (50% of the subjects from each dataset for training, the remaining for validation). RESULTS: The datasets showed high performance in within dataset cross validation (accuracy 71.9-95.4%, Kappa K = 0.63-0.94). The performance of the within dataset validated models decreased when applied to datasets with different accelerometers (41.2-59.9%, K = 0.21-0.48). The trained models on merged datasets consisting hip and wrist data predicted the left-out dataset with acceptable performance (65.9-83.7%, K = 0.61-0.79). The model trained with all five datasets performed with acceptable performance across the datasets (80.4-90.7%, K = 0.68-0.89). CONCLUSIONS: Integrating heterogeneous datasets in training sets seems a viable approach for enhancing the generalization performance of the models. Instead, within dataset validation is not sufficient to understand the models' performance on other populations with different accelerometers.

Asunto(s)

Acelerometría/métodos , Ejercicio Físico/fisiología , Aprendizaje Automático , Reconocimiento de Normas Patrones Automatizadas/métodos , Adulto , Bases de Datos Factuales , Actividades Humanas/clasificación , Humanos , Modelos Estadísticos , Monitoreo Fisiológico , Redes Neurales de la Computación

14.

TSE-CNN: A Two-Stage End-to-End CNN for Human Activity Recognition.

Huang, Jiahui; Lin, Shuisheng; Wang, Ning; Dai, Guanghai; Xie, Yuxiang; Zhou, Jun.

IEEE J Biomed Health Inform ; 24(1): 292-299, 2020 01.

Artículo en Inglés | MEDLINE | ID: mdl-30969934

RESUMEN

Human activity recognition has been widely used in healthcare applications such as elderly monitoring, exercise supervision, and rehabilitation monitoring. Compared with other approaches, sensor-based wearable human activity recognition is less affected by environmental noise and therefore is promising in providing higher recognition accuracy. However, one of the major issues of existing wearable human activity recognition methods is that although the average recognition accuracy is acceptable, the recognition accuracy for some activities (e.g., ascending stairs and descending stairs) is low, mainly due to relatively less training data and complex behavior pattern for these activities. Another issue is that the recognition accuracy is low when the training data from the test subject are limited, which is a common case in real practice. In addition, the use of neural network leads to large computational complexity and thus high power consumption. To address these issues, we proposed a new human activity recognition method with two-stage end-to-end convolutional neural network and a data augmentation method. Compared with the state-of-the-art methods (including neural network based methods and other methods), the proposed methods achieve significantly improved recognition accuracy and reduced computational complexity.

Asunto(s)

Actividades Humanas/clasificación , Movimiento/fisiología , Redes Neurales de la Computación , Acelerometría/métodos , Adulto , Femenino , Humanos , Masculino , Persona de Mediana Edad , Monitoreo Ambulatorio/métodos , Reconocimiento de Normas Patrones Automatizadas , Dispositivos Electrónicos Vestibles , Adulto Joven

15.

Force from Motion: Decoding Control Force of Activity in a First-Person Video.

Park, Hyun Soo; Shi, Jianbo.

IEEE Trans Pattern Anal Mach Intell ; 42(3): 622-635, 2020 03.

Artículo en Inglés | MEDLINE | ID: mdl-30489262

RESUMEN

A first-person video delivers what the camera wearer (actor) experiences through physical interactions with surroundings. In this paper, we focus on a problem of Force from Motion-estimating the active force and torque exerted by the actor to drive her/his activity-from a first-person video. We use two physical cues inherited in the first-person video. (1) Ego-motion: the camera motion is generated by a resultant of force interactions, which allows us to understand the effect of the active force using Newtonian mechanics. (2) Visual semantics: the first-person visual scene is deployed to afford the actor's activity, which is indicative of the physical context of the activity. We estimate the active force and torque using a dynamical system that can describe the transition (dynamics) of the actor's physical state (position, orientation, and linear/angular momentum) where the latent physical state is indirectly observed by the first-person video. We approximate the physical state with the 3D camera trajectory that is reconstructed up to scale and orientation. The absolute scale factor and gravitation field are learned from the ego-motion and visual semantics of the first-person video. Inspired by an optimal control theory, we solve the dynamical system by minimizing reprojection error. Our method shows quantitatively equivalent reconstruction comparing to IMU measurements in terms of gravity and scale recovery and outperforms the methods based on 2D optical flow for an active action recognition task. We apply our method to first-person videos of mountain biking, urban bike racing, skiing, speedflying with parachute, and wingsuit flying where inertial measurements are not accessible.

Asunto(s)

Actividades Humanas/clasificación , Procesamiento de Imagen Asistido por Computador/métodos , Movimiento/fisiología , Grabación en Video/métodos , Aceleración , Humanos , Deportes

16.

A Semisupervised Recurrent Convolutional Attention Model for Human Activity Recognition.

Chen, Kaixuan; Yao, Lina; Zhang, Dalin; Wang, Xianzhi; Chang, Xiaojun; Nie, Feiping.

IEEE Trans Neural Netw Learn Syst ; 31(5): 1747-1756, 2020 05.

Artículo en Inglés | MEDLINE | ID: mdl-31329134

RESUMEN

Recent years have witnessed the success of deep learning methods in human activity recognition (HAR). The longstanding shortage of labeled activity data inherently calls for a plethora of semisupervised learning methods, and one of the most challenging and common issues with semisupervised learning is the imbalanced distribution of labeled data over classes. Although the problem has long existed in broad real-world HAR applications, it is rarely explored in the literature. In this paper, we propose a semisupervised deep model for imbalanced activity recognition from multimodal wearable sensory data. We aim to address not only the challenges of multimodal sensor data (e.g., interperson variability and interclass similarity) but also the limited labeled data and class-imbalance issues simultaneously. In particular, we propose a pattern-balanced semisupervised framework to extract and preserve diverse latent patterns of activities. Furthermore, we exploit the independence of multi-modalities of sensory data and attentively identify salient regions that are indicative of human activities from inputs by our recurrent convolutional attention networks. Our experimental results demonstrate that the proposed model achieves a competitive performance compared to a multitude of state-of-the-art methods, both semisupervised and supervised ones, with 10% labeled training data. The results also show the robustness of our method over imbalanced, small training data sets.

Asunto(s)

Actividades Humanas/clasificación , Redes Neurales de la Computación , Reconocimiento de Normas Patrones Automatizadas/clasificación , Reconocimiento de Normas Patrones Automatizadas/métodos , Aprendizaje Automático Supervisado/clasificación , Humanos

17.

Moments in Time Dataset: One Million Videos for Event Understanding.

Monfort, Mathew; Andonian, Alex; Zhou, Bolei; Ramakrishnan, Kandan; Bargal, Sarah Adel; Yan, Tom; Brown, Lisa; Fan, Quanfu; Gutfreund, Dan; Vondrick, Carl; Oliva, Aude.

IEEE Trans Pattern Anal Mach Intell ; 42(2): 502-508, 2020 02.

Artículo en Inglés | MEDLINE | ID: mdl-30802849

RESUMEN

We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds. Modeling the spatial-audio-temporal dynamics even for actions occurring in 3 second videos poses many challenges: meaningful events do not include only people, but also objects, animals, and natural phenomena; visual and auditory events can be symmetrical in time ("opening" is "closing" in reverse), and either transient or sustained. We describe the annotation process of our dataset (each video is tagged with one action or activity label among 339 different classes), analyze its scale and diversity in comparison to other large-scale video datasets for action recognition, and report results of several baseline models addressing separately, and jointly, three modalities: spatial, temporal and auditory. The Moments in Time dataset, designed to have a large coverage and diversity of events in both visual and auditory modalities, can serve as a new challenge to develop models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis.

Asunto(s)

Bases de Datos Factuales , Grabación en Video , Animales , Actividades Humanas/clasificación , Humanos , Procesamiento de Imagen Asistido por Computador , Reconocimiento de Normas Patrones Automatizadas

18.

Standardised criteria for classifying the International Classification of Activities for Time-use Statistics (ICATUS) activity groups into sleep, sedentary behaviour, and physical activity.

Liangruenrom, Nucharapon; Craike, Melinda; Dumuid, Dorothea; Biddle, Stuart J H; Tudor-Locke, Catrine; Ainsworth, Barbara; Jalayondeja, Chutima; van Tienoven, Theun Pieter; Lachapelle, Ugo; Weenas, Djiwo; Berrigan, David; Olds, Timothy; Pedisic, Zeljko.

Int J Behav Nutr Phys Act ; 16(1): 106, 2019 11 14.

Artículo en Inglés | MEDLINE | ID: mdl-31727080

RESUMEN

BACKGROUND: Globally, the International Classification of Activities for Time-Use Statistics (ICATUS) is one of the most widely used time-use classifications to identify time spent in various activities. Comprehensive 24-h activities that can be extracted from ICATUS provide possible implications for the use of time-use data in relation to activity-health associations; however, these activities are not classified in a way that makes such analysis feasible. This study, therefore, aimed to develop criteria for classifying ICATUS activities into sleep, sedentary behaviour (SB), light physical activity (LPA), and moderate-to-vigorous physical activity (MVPA), based on expert assessment. METHOD: We classified activities from the Trial ICATUS 2005 and final ICATUS 2016. One author assigned METs and codes for wakefulness status and posture, to all subclass activities in the Trial ICATUS 2005. Once coded, one author matched the most detailed level of activities from the ICATUS 2016 with the corresponding activities in the Trial ICATUS 2005, where applicable. The assessment and harmonisation of each ICATUS activity were reviewed independently and anonymously by four experts, as part of a Delphi process. Given a large number of ICATUS activities, four separate Delphi panels were formed for this purpose. A series of Delphi survey rounds were repeated until a consensus among all experts was reached. RESULTS: Consensus about harmonisation and classification of ICATUS activities was reached by the third round of the Delphi survey in all four panels. A total of 542 activities were classified into sleep, SB, LPA, and MVPA categories. Of these, 390 activities were from the Trial ICATUS 2005 and 152 activities were from the final ICATUS 2016. The majority of ICATUS 2016 activities were harmonised into the ICATUS activity groups (n = 143). CONCLUSIONS: Based on expert consensus, we developed a classification system that enables ICATUS-based time-use data to be classified into sleep, SB, LPA, and MVPA categories. Adoption and consistent use of this classification system will facilitate standardisation of time-use data processing for the purpose of sleep, SB and physical activity research, and improve between-study comparability. Future studies should test the applicability of the classification system by applying it to empirical data.

Asunto(s)

Ejercicio Físico , Actividades Humanas/clasificación , Conducta Sedentaria , Sueño/fisiología , Encuestas y Cuestionarios/normas , Humanos

19.

Use of national-scale data to examine human-mediated additions of heavy metals to wetland soils of the US.

Nahlik, Amanda M; Blocksom, Karen A; Herlihy, Alan T; Kentula, Mary E; Magee, Teresa K; Paulsen, Steven G.

Environ Monit Assess ; 191(Suppl 1): 336, 2019 Jun 20.

Artículo en Inglés | MEDLINE | ID: mdl-31222398

RESUMEN

Soil concentrations of 12 heavy metals that have been linked to various anthropogenic activities were measured in samples collected from the uppermost horizon in approximately 1000 wetlands across the conterminous US as part of the 2011 National Wetland Condition Assessment (NWCA). The heavy metals were silver (Ag), cadmium (Cd), cobalt (Co), chromium (Cr), copper (Cu), nickel (Ni), lead (Pb), antimony (Sb), tin (Sn), vanadium (V), tungsten (W), and zinc (Zn). Using thresholds to distinguish natural background concentrations from human-mediated additions, we evaluated wetland soil heavy metal concentrations in the conterminous US and four regions using a Heavy Metal Index (HMI) that reflects human-mediated heavy metal loads based on the number of elements above expected background concentration. We also examined the individual elements to detect concentrations of heavy metals above expected background that frequently occur in wetland soils. Our data show that wetland soils of the conterminous US typically have low heavy metal loads, and that most of the measured elements occur nationally in concentrations below thresholds that relate to anthropogenic activities. However, we found that soil lead is more common in wetland soils than other measured elements, occurring nationally in 11.3% of the wetland area in concentrations above expected natural background (> 35 ppm). Our data show positive relationships between soil lead concentration and four individual landscape metrics: road density, percent impervious surface, housing unit density, and population density in a 1-km radius buffer area surrounding a site. These relationships, while evident on a national level, are strongest in the eastern US, where the highest road densities and greatest population densities occur. Because lead can be strongly bound to wetland soils in particular, maintenance of the good condition of our nation's wetlands is likely to minimize risk of lead mobilization.

Asunto(s)

Monitoreo del Ambiente/métodos , Actividades Humanas , Metales Pesados/análisis , Contaminantes del Suelo/análisis , Humedales , Monitoreo del Ambiente/estadística & datos numéricos , Actividades Humanas/clasificación , Actividades Humanas/estadística & datos numéricos , Humanos , Factores de Riesgo , Estados Unidos

20.

Quantifying the extent of human disturbance activities and anthropogenic stressors in wetlands across the conterminous United States: results from the National Wetland Condition Assessment.

Lomnicky, Gregg A; Herlihy, Alan T; Kaufmann, Philip R.

Environ Monit Assess ; 191(Suppl 1): 324, 2019 Jun 20.

Artículo en Inglés | MEDLINE | ID: mdl-31222443

RESUMEN

In 2011, the U.S. Environmental Protection Agency conducted the National Wetland Condition Assessment (NWCA) as part of the National Aquatic Resource Survey (NARS) program to determine the condition of wetlands across the 48 contiguous states of the United States (US). Sites were selected using a generalized random tessellated stratified (GRTS) probability design. We quantified the types, extent, and magnitude of human activities as indicators of potential stress on a sample of 1138 wetland sites representing a target population of 251,546 km2 of wetlands in the US. We used field observations of the presence and proximity of more than 50 pre-determined types of human activity to define two types of indices that quantify human influences on wetlands. We grouped these observations into five types of human activity (classes) and summed them within and across these classes to define five metrics and an overall Human Disturbance Activity Index (HDAI). We calculated six Anthropogenic Stress Indices (ASIs) by summing human disturbance activity observations within stressor categories according to their expected effect on each of six aspects of wetland condition. Based on repeat-visit data, the precision of these metrics and indices was sufficient for regional and national assessments. Among the six categories of stress assessed nationally, the percentage of wetland area having ASI levels indicating high stress levels ranged from 10% due to filling/erosional activities to 27% due to vegetation removal activities. The proportion of wetland area with no signs of human disturbance activity (HDAI = 0) within a 140-m diameter area varied widely among the different wetland ecoregions/types we assessed. No visible human disturbance activity was evident in 70% of estuarine wetlands, but among non-estuarine wetlands, only 8% of the wetland area in the West, 15% of the Interior Plains, 22% of the Coastal Plains, and 36% of the Eastern Mountains and Upper Midwest lacked visible evidence of disturbance. The woody wetlands of the West were the most highly stressed reporting group, with more than 75% of their wetland area subject to high levels of ditching, hardening, and vegetation removal. The NWCA offers a unique opportunity to quantify the type, intensity, and extent of human activities in and around wetlands and to assess their likely stress on wetland ecological functions, physical integrity, and overall condition at regional and continental scales.

Asunto(s)

Monitoreo del Ambiente/estadística & datos numéricos , Actividades Humanas/estadística & datos numéricos , United States Environmental Protection Agency/estadística & datos numéricos , Humedales , Recolección de Datos , Ambiente , Actividades Humanas/clasificación , Humanos , Desarrollo de la Planta , Factores de Riesgo , Estados Unidos , United States Environmental Protection Agency/organización & administración

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA